Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets
نویسندگان
چکیده
The long short-term memory (LSTM) network trained by gradient descent solves difficult problems which traditional recurrent neural networks in general cannot. We have recently observed that the decoupled extended Kalman filter training algorithm allows for even better performance, reducing significantly the number of training steps when compared to the original gradient descent training algorithm. In this paper we present a set of experiments which are unsolvable by classical recurrent networks but which are solved elegantly and robustly and quickly by LSTM combined with Kalman filters.
منابع مشابه
Learning Context Sensitive Languages with LSTM Trained with Kalman Filters
Unlike traditional recurrent neural networks, the Long ShortTerm Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context sen...
متن کاملImproving Long-Term Online Prediction with Decoupled Extended Kalman Filters
Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform traditional RNNs when dealing with sequences involving not only short-term but also long-term dependencies. The decoupled extended Kalman filter learning algorithm (DEKF) works well in online environments and reduces significantly the number of training steps when compared to the standard gradient-descent algorithms. Prev...
متن کاملComplex Extended Kalman Filters for Training Recurrent Neural Network Channel Equalizers
The Kalman filter was named after Rudolph E. Kalman published in 1960 his famous paper (Kalman, 1960) describing a recursive solution to the discrete-data linear filtering problem. There are several tutorial papers and books dealing with the subject for a great variety of applications in many areas from engineering to finance (Grewal & Andrews, 2001; Sorenson, 1970; Haykin, 2001; Bar-Shalom & L...
متن کاملConvolutional LSTM Networks for Subcellular Localization of Proteins
Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short term memory (LSTM) model on the other hand are designed to handle sequences. In this study we demonstrate that LSTM networ...
متن کاملThe Optimization of Forecasting ATMs Cash Demand of Iran Banking Network Using LSTM Deep Recursive Neural Network
One of the problems of the banking system is cash demand forecasting for ATMs (Automated Teller Machine). The correct prediction can lead to the profitability of the banking system for the following reasons and it will satisfy the customers of this banking system. Accuracy in this prediction are the main goal of this research. If an ATM faces a shortage of cash, it will face the decline of bank...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Neural networks : the official journal of the International Neural Network Society
دوره 16 2 شماره
صفحات -
تاریخ انتشار 2003